Abstract

The sheer amount of information about potential adverse drug events published in
medical case reports poses major challenges for drug safety experts who must perform
timely monitoring. Efficient strategies for identifying and extracting information about
potential adverse drug events from free-text resources are needed to support
pharmacovigilance research and pharmaceutical decision making. This work therefore
focuses on the adaptation of a machine learning-based system for the identification
and extraction of potential adverse drug event relations from MEDLINE case reports. It
relies on a high-quality corpus that was manually annotated using an ontology-driven
methodology. Qualitative evaluation of the system showed robust results. An
experiment with large-scale relation extraction from MEDLINE delivered under-identified
potential adverse drug events not reported in drug monographs. Overall, this
approach provides a scalable auto-assistance platform for drug safety professionals to
automatically collect potential adverse drug events communicated as free-text data.



Background 

Adverse drug effects are a serious issue for patients, healthcare providers, regulatory
authorities and drug manufacturers. Clinical trials are the most stringent measure for
detecting risks associated with drug usage, but wide use in the field may reveal additional
risks that are not detectable in clinical trials because of the limited number of patients
involved. After marketing approval, undesired effects of drugs are reported to the
authorities through so-called Spontaneous Adverse Event Reporting Systems (SAERS), and
the reports are analyzed in a timely manner to ensure the safe use of drugs [1]. A
well-known problem of pharmacovigilance, however, is under-reporting, namely the low
number of reports that the authorities receive. Case reports published in the scientific
biomedical literature represent an important resource complementary to SAERS because of
their abundance, rapid rate of generation, and the valuable information they contain [2].
Because the scientific literature is unstructured, its manual analysis is challenging,
cumbersome, and labor intensive.

In recent years, automatic natural language processing (NLP) and information extraction
(IE) techniques have gained wide popularity. They include the identification of biomedical
named entities, relations between the entities, and events associated with them.
Considerable effort has been invested in mining potential adverse drug events from
different forms of free-text data. Examples include Wang et al. [3], who applied the MedLEE
system to discharge summaries to identify medication events and entities that could be
potential adverse drug effects; these were detected using the strength of statistical
association based on their co-occurrences. Leaman et al. [4] proposed a lenient NLP model
for extracting adverse effects of drugs from social media such as blogs. Gurulingappa et al.
[5] developed a machine learning-based system for classifying the sentences in MEDLINE
case reports that assert potential adverse drug events. However, to the authors' knowledge,
there has been limited focus on the identification of semantic relationships between drugs
and adverse events in text. This is partly due to the unavailability of suitable open-access
corpora that could be used for technology development and benchmarking. Extracting
relations between drugs and adverse effects can facilitate appropriate indexing, precise
searching, visualization, and faster information tracing, and can improve the sensitivity of
signal detection in pharmacovigilance. The use of an ontology of adverse drug events for
automated signal generation in pharmacovigilance has already been proposed [6], and its
application to information retrieval was explored by the same group a few years later in the
VIGITERMES project [7]. There, the OntoEIM adverse event ontology was used to extend the
dictionary of adverse event entities, normalize queries, and consolidate annotations,
achieving 29% precision and 67% recall on MEDLINE abstracts. Automatic extraction of
potential adverse drug events from clinical records is an active area of research [8]. Mining
of social internet message boards to identify potential adverse drug events has also been
reported [9]; in that work, drug-event pairs were extracted using only the co-occurrence of
terms within a window of 20 tokens, and machine learning was applied only to
de-identification for privacy protection.

This work reports on the adaptation of a machine learning-based system for identifying the
relations between drugs and adverse effects in MEDLINE case reports; it relies on an
ontology-driven, manually annotated corpus that strictly follows semantic annotation
guidelines developed for clinical text [10]. The system has been qualitatively evaluated and
studied for its ability to support real-time pharmacovigilance studies.

Methods 

Corpus preparation 

The data set used for training and validation of the relation extraction system is the ADE
corpus [11]. The ADE corpus contains 2972 MEDLINE case reports that were manually
annotated in duplicate and harmonized by three annotators. The corpus contains
annotations of 5063 drugs, 5776 conditions (e.g. diseases, signs, symptoms), and 6821
relations between drugs and conditions representing clear adverse events. All annotations
are confined to the sentence level, i.e. only drugs and conditions that represent adverse
events and co-occur within an individual sentence are annotated. Drugs and conditions that
are not part of a potential adverse event relation are not annotated, in accordance with the
annotation guidelines. The ADE corpus therefore contains only relations between drugs and
conditions that represent True relations, which makes it a sparsely annotated dataset.
For training a supervised classifier, it was essential to generate False relations, i.e. drugs
and conditions that do not stand in an adverse effect relation but still occur within the
same sentence. For this purpose, ProMiner, a dictionary-based named entity recognition
system [12], was employed. ProMiner was equipped with the DrugBank [13] and MedDRA
[14] dictionaries to identify drugs and conditions, respectively, that had not been
annotated by the human annotators in the ADE corpus. As a result of named entity
recognition, new instances encompassing 2269 drugs and 3437 conditions were
automatically annotated. Drug-condition pairs co-occurring within a sentence that had not
been annotated by humans formed the False relations. Altogether, 5968 False relations
were automatically generated. The corpus enriched with machine-annotated drugs,
conditions, and relations between them is referred to as ADE-EXT (indicating the extended
ADE corpus). Figure 1 shows an illustration of True and False relations between a drug and
conditions co-occurring within a sentence.

In the ADE-EXT corpus, 120 manually annotated True relations were not suitable for
the NLP task. Examples include overlapping inter-related entities such as acute lithium
toxicity, where lithium is related to acute toxicity. After removal of these nested
annotations, the ADE-EXT corpus was split into a training set (ADE-EXT-TRAIN) and a test
set (ADE-EXT-TEST). Counts of entities and relations in the subsets of the ADE-EXT corpus
are shown in Table 1.
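
The generation of False relations described in this section can be illustrated with a minimal Python sketch. It assumes that the drug and condition mentions of one sentence are already available as plain strings (from human annotation or from a dictionary tagger); the function name `label_pairs` and the data layout are hypothetical and are not part of ProMiner or of the ADE corpus tooling.

```python
from itertools import product

def label_pairs(drugs, conditions, true_pairs):
    """Label every drug-condition pair of one sentence as True or False.

    drugs, conditions: entity mentions found in the sentence
                       (human annotated or dictionary tagged).
    true_pairs: set of (drug, condition) tuples that human annotators
                marked as adverse event relations.
    Any co-occurring pair outside true_pairs becomes a False relation.
    """
    labelled = []
    for drug, condition in product(drugs, conditions):
        labelled.append((drug, condition, (drug, condition) in true_pairs))
    return labelled

# Sentence of Figure 1: 'chronic renal failure' is the condition that was
# not annotated by humans and therefore yields a False relation.
pairs = label_pairs(
    drugs=["colchicine"],
    conditions=["acute proximal muscle weakness", "chronic renal failure"],
    true_pairs={("colchicine", "acute proximal muscle weakness")},
)
for drug, condition, is_true in pairs:
    print(f"{drug} -> {condition} ({is_true})")
```

Applied to the sentence of Figure 1, the sketch yields one True and one False relation for colchicine.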

Relation extraction workflow 

For the identification and extraction of drug-condition entity pairs that constitute a
potential adverse event relation, the Java Simple Relation Extraction (JSRE) system [15]
was employed. JSRE provides a re-trainable and scalable supervised classification
platform that uses Support Vector Machines (SVMs) [16] with different kernels specially
designed for NLP and relation extraction. All sentences in ADE-EXT-TRAIN and
ADE-EXT-TEST containing drug-condition pairs labelled as either True or False were
transformed into the SRE format before being subjected to relation extraction. The SRE
format is the JSRE-specific data representation in which the tokens of a sentence are
enriched with their part-of-speech tags, lemmas, and flags indicating whether a token is
part of a named entity. Amongst the different kernels available, the shallow linguistic
kernel was used throughout, since it has been widely applied and has shown success in
similar relation extraction tasks [17]. ADE-EXT-TRAIN was used for training and
cross-validation of JSRE, whereas ADE-EXT-TEST was used as an independent test set.
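
To make the idea of token enrichment more concrete, the following Python sketch serializes one candidate relation as a label followed by tokens that carry a lemma, a part-of-speech tag and an entity flag. It is only an illustration: the field order, the `|` separator, the role names and the hand-written tags are assumptions made here, and the actual SRE file layout should be taken from the JSRE documentation.

```python
from dataclasses import dataclass

@dataclass
class Token:
    text: str
    lemma: str
    pos: str   # part-of-speech tag (hand-written here, normally from a tagger)
    role: str  # "DRUG", "CONDITION" or "O" (not part of the candidate pair)

def encode_example(label, tokens, sep="|"):
    """Serialize one candidate relation as '<label>TAB<enriched tokens>'.

    Illustrative layout only; not the real SRE format.
    """
    body = " ".join(
        sep.join((str(i), t.text, t.lemma, t.pos, t.role))
        for i, t in enumerate(tokens)
    )
    return f"{int(label)}\t{body}"

tokens = [
    Token("colchicine", "colchicine", "NN", "DRUG"),
    Token("caused", "cause", "VBD", "O"),
    Token("weakness", "weakness", "NN", "CONDITION"),
]
line = encode_example(True, tokens)
print(line)
```

One such line would be produced per drug-condition pair, so a sentence with several candidate pairs appears several times with different entity flags.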

A 60 year-old woman with chronic renal failure developed acute proximal muscle weakness
after receiving a regular dosage of colchicine. [PMID: 10904571]
colchicine -> acute proximal muscle weakness (True)
colchicine -> chronic renal failure (False)

Figure 1 Example of an annotated sentence in the ADE corpus. Example of a sentence annotated with
drug, conditions, and relations between them in the ADE corpus. True indicates presence of an adverse
effect relation and False indicates absence of an adverse effect relation.

Table 1 Counts of entities and relations in ADE-EXT corpus subsets

Corpus                         ADE-EXT-TRAIN    ADE-EXT-TEST
Documents                      1884             210
Drugs                          6770             758
Conditions (adverse effect)    8539             978
Sentences                      5333             606
True Relations                 6030             671
False Relations                4799             546

Mapping annotation ontology against ontology of adverse events

The Clinical E-Science Framework (CLEF) initiative [18] investigated how to generate
semantically annotated medical corpora for information extraction. As described by
Gurulingappa et al. [11], we adopted the standard established by the CLEF framework for
the annotation workflow [10]; however, we reshaped the annotation schema by using only
two of the original entities (condition, drug) and extending it with a third one (dosage).
None of the relationships of the CLEF annotation schema could be reused for our work,
since that schema did not consider adverse drug events; instead, we created two relations:
drug-cause-condition and drug-has-dosage. In this work we focused only on automating the
detection of drug-cause-condition, so dosage will not be mentioned further. The ADE
corpus was created using the Knowtator plugin for Protege [19], an ontology-driven corpus
annotation tool that was also used for the creation of the CLEF corpus. Although we
adopted the same tool and the same annotation workflow standard as CLEF, we could not
adopt the same annotation ontology, since the latter was not able to capture drug-adverse
event and drug-dosage relations. The annotation ontology described above was therefore
used to create the ADE corpus. Subsequent to the corpus creation, the realism-based
biomedical ontology for the representation of adverse events (OAE) was published [20].
OAE has been developed following the principles of Ontological Realism; it is thus aligned
with the Basic Formal Ontology and the Relation Ontology, and with the Open Biological and
Biomedical Ontologies (OBO) Foundry principles of openness, collaboration and use of
a common shared syntax. OAE has 484 representational units, annotated by means of
369 terms with specific identifiers and 115 terms imported from existing ontologies. The
use of ontologies has proven of great value in biomedicine, not least because it enables
machine reasoning, abstraction and automatic hypothesis generation. We were therefore
interested in investigating whether the knowledge encoded in the annotations of the ADE
corpus could be semantically connected to OAE. To do this, we manually compared the
definitions of the entities of OAE and of the ADE annotation ontology. Figure 2 shows the
basic design patterns of OAE, ADE and CLEF as given in the original papers, emphasizing
shared entities using green and red colors.

Results and discussion 

Performance evaluation criteria 

The performance of relation extraction was evaluated by 10-fold cross-validation on
the training data and by a final evaluation on the test set. In both cases, classification
performance was assessed using the F-score over True-labelled relations, since these
represent the potential adverse event relations between drugs and conditions, i.e. the
relation class under study.
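
As a small worked illustration of this criterion, the sketch below computes precision, recall and F-score over the True class only. It assumes the balanced F-score (harmonic mean of precision and recall), which the text does not state explicitly; the function name and the toy labels are hypothetical and are not taken from the evaluation reported here.

```python
def prf_true_class(gold, predicted):
    """Precision, recall and balanced F-score over the True class only.

    gold, predicted: equally long sequences of booleans, one per
    candidate drug-condition pair.
    """
    tp = sum(1 for g, p in zip(gold, predicted) if g and p)
    fp = sum(1 for g, p in zip(gold, predicted) if not g and p)
    fn = sum(1 for g, p in zip(gold, predicted) if g and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score

# Toy labels (not corpus data): 2 true positives, 1 false positive,
# 1 false negative and 1 correctly rejected False relation.
gold = [True, True, True, False, False]
predicted = [True, True, False, True, False]
precision, recall, f_score = prf_true_class(gold, predicted)
print(precision, recall, f_score)
```

Correctly rejected False relations do not enter any of the three measures, which is why the score reflects only the relation class of interest.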

Assessment of relation extraction 

Baseline experiments began with training and cross-validation of JSRE on the
ADE-EXT-TRAIN corpus. The system's performance is shown in Table 2. The system
achieved an overall F-score of 0.87 in cross-validation. In the final test on
ADE-EXT-TEST, the system again attained an F-score of 0.87, indicating consistent
classification. A subset of the instances misclassified during cross-validation and testing
was manually investigated to understand the common sources of error. Limited context
appeared to be one reason for misclassification. For example, the title "Niacin maculopathy"
(PMID: 3174043) implies that maculopathy is a potential adverse event of niacin, but it
lacks the contextual description needed to support machine classification. Distantly
co-occurring inter-related entities accounted for a couple of errors. For example, in the
sentence "CASE SUMMARY: A 65-year-old patient chronically treated with the selective
serotonin reuptake inhibitor (SSRI) citalopram developed confusion, agitation, tachycardia,
tremors, myoclonic jerks and unsteady gait, consistent with serotonin syndrome, following
initiation of fentanyl, and all symptoms and signs resolved following discontinuation of
fentanyl" (PMID: 17381671), the relation between confusion and the last-appearing drug
name fentanyl was incorrectly classified. Case reports often state the frequencies at which
potential adverse events were observed. For instance, in the sentence "The toxic effects of
methotrexate included elevated liver transaminases (3/4), nausea (2/4), abdominal pain
(2/4), bone pain (2/4), mild neutropenia (1/4), and mild pruritus (1/4)" (PMID:433855), the
system had difficulty identifying the correct relations. Potential adverse drug
events are categorized according to their severity: serious suspected adverse drug
reactions require immediate action by medical professionals. Manual investigation of the
predicted results showed that the system was able to capture most of the serious
potential adverse events. These findings demonstrate the potential of this approach to
facilitate the identification of potential signals from case reports, which is of great
interest to drug safety experts.

Table 2 Assessment of results of relation extraction

Evaluation          Precision    Recall    F-score
Cross-Validation    0.87         0.86      0.87
Final Test          0.86         0.89      0.87

[Figure 2 diagram: three panels labelled CLEF annotation ontology, ADE annotation ontology,
and Ontology of Adverse Events.]

Figure 2 Ontologies discussed in this work. Mappings between the ADE, OAE, and CLEF ontologies
are shown. Identical entities are in boxes with the same colours. Condition in the CLEF ontology
is mapped to Process in the OAE.

Impact of size of the training set on the performance 

In order to study the impact of the size of the training data on classification
performance, ADE-EXT-TRAIN was decomposed into random subsets containing 10, 20,
50, 100, 200, 500, 1000, and 2000 documents. JSRE was trained on these subsets
independently in different rounds and evaluated by 10-fold cross-validation. Table 3
shows that with as few as 200 documents one can already achieve performance above
80%. However, to obtain a classifier with a standard deviation of about 1%, a
substantially larger training set is needed.
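
The experimental procedure can be sketched in Python as follows. This is a minimal outline under stated assumptions, not the setup actually used with JSRE: `train_and_score` stands for any hypothetical callback that trains a classifier on the training folds and returns a score on the held-out fold, documents are represented by arbitrary objects, and the sample standard deviation is used because the text does not say which variant was reported.

```python
import random
import statistics

def cross_validate(docs, train_and_score, k=10):
    """Return (mean, standard deviation) of a score over k folds."""
    folds = [docs[i::k] for i in range(k)]
    scores = []
    for i, held_out in enumerate(folds):
        training = [d for j, fold in enumerate(folds) if j != i for d in fold]
        scores.append(train_and_score(training, held_out))
    return statistics.mean(scores), statistics.stdev(scores)

def learning_curve(docs, sizes, train_and_score, k=10, seed=0):
    """Cross-validate independently on one random subset per size."""
    rng = random.Random(seed)
    curve = {}
    for n in sizes:
        subset = rng.sample(docs, n)
        curve[n] = cross_validate(subset, train_and_score, k)
    return curve

# Placeholder scorer so that the sketch runs; a real callback would train
# a relation classifier and return, for example, its F-score.
def dummy_scorer(training, held_out):
    return len(training) / (len(training) + len(held_out))

curve = learning_curve(list(range(2000)), [10, 20, 50], dummy_scorer)
for n, (mean, sd) in curve.items():
    print(n, round(mean, 2), round(sd, 2))
```

With a real scorer, small subsets would be expected to show the large fold-to-fold variation that Table 3 reports for 10 and 20 documents.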

Mapping the ADE annotation ontology to the ontology of adverse events 

As shown in Figure 2, both the ADE annotation ontology and OAE represent
adverse drug reactions using formal ontological methods. In spite of this common goal,
the two ontologies use different names for the two core entities: a Condition in the
ADE annotation ontology coincides with a drug adverse event in OAE, and a Drug in the ADE
annotation ontology coincides with a drug administration in OAE. The ADE ontology
additionally introduces the entity Dosage, which was not specified in OAE at the time of
its development, since OAE originally focused on vaccines, for which dosing is not an
essential medical concept. Both ADE and OAE model a causal relationship between Condition
or Adverse event and Drug or Medical intervention, with the latter being the causal source.
The only entity that the CLEF annotation ontology shares with OAE and ADE is
Drug-or-device, which coincides with Drug or Medical intervention.
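
The correspondences described in this section can be written down as a simple lookup table. The Python sketch below uses only the entity labels mentioned in the text; it does not use real OAE identifiers, and the names `ADE_TO_OAE` and `to_oae` are hypothetical.

```python
# ADE annotation ontology entity -> corresponding OAE entity label.
# None marks an entity for which the text describes no OAE counterpart.
ADE_TO_OAE = {
    "Condition": "drug adverse event",
    "Drug": "drug administration",
    "Dosage": None,
}

def to_oae(ade_entity):
    """Return the OAE label for an ADE entity (KeyError if unknown)."""
    return ADE_TO_OAE[ade_entity]

unmapped = [entity for entity, target in ADE_TO_OAE.items() if target is None]
print(to_oae("Condition"), unmapped)
```

A table of this kind is the minimum needed to re-express ADE annotations in OAE terms; richer mappings would have to refer to the ontology's own identifiers.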

Table 3 Impact of size of the training set on relation extraction

          Precision        Recall           F-score
N         Mean    SD       Mean    SD       Mean    SD
10        0.58    0.41     0.60    0.44     0.55    0.38
20        0.62    0.36     0.69    0.38     0.64    0.37
50        0.79    0.13     0.87    0.06     0.82    0.09
100       0.81    0.05     0.75    0.08     0.78    0.04
200       0.85    0.07     0.84    0.05     0.84    0.04
500       0.82    0.04     0.85    0.01     0.84    0.02
1000      0.83    0.02     0.87    0.02     0.86    0.01
2000      0.87    0.01     0.86    0.01     0.87    0.01

Impact of size of the training set on relation extraction was measured by independent cross-validations over subsets of the
ADE-EXT-TRAIN corpus. N indicates number of documents in the training set and SD indicates the standard deviation measured
during the 10-fold cross validation.

Use case study: large scale relation extraction

An experiment was conducted to understand real-world use case scenarios for the
extraction of potential adverse drug events from text. The trained extraction tool was
applied to the whole of MEDLINE, and the extracted relations were then compared with the
information in the drug leaflets collected in the SIDER database [21]. Some of the
automatically extracted potential adverse drug events that are not present in SIDER were
manually investigated for their validity by comparison with the drug label changes
reported by the Medicines and Healthcare products Regulatory Agency (MHRA) in 2009.

Relation extraction from MEDLINE 

MEDLINE articles published before 2009 were gathered to form the Medline-2009 corpus.
ProMiner was equipped with the DrugBank and MedDRA dictionaries for tagging drugs
and conditions occurring in the sentences of Medline-2009. A JSRE model trained on the
ADE-EXT-TRAIN corpus was applied to classify relations between drugs and conditions as
True or False, where a True relation indicates a potential drug-related adverse event.
As a result, 165680 relations were extracted between 1611 drugs and 5079 adverse effects,
where drugs and adverse effects were normalized to DrugBank and MedDRA, respectively.
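
The final aggregation step can be pictured with a short Python sketch. It assumes that each classified pair is available as a tuple of a normalized drug identifier, a normalized effect identifier and the predicted label; the identifiers below are placeholders and not real DrugBank or MedDRA codes. Whether the reported relation count refers to mentions or to unique pairs is not stated in the text, so the sketch keeps both.

```python
from collections import Counter

def aggregate(predictions):
    """Collect True predictions per normalized (drug_id, effect_id) pair.

    predictions: iterable of (drug_id, effect_id, is_true) tuples, one per
    classified drug-condition pair in a sentence.
    Returns the mention counts per pair and the sets of distinct drugs
    and effects that take part in at least one True relation.
    """
    counts = Counter(
        (drug_id, effect_id)
        for drug_id, effect_id, is_true in predictions
        if is_true
    )
    drugs = {drug_id for drug_id, _ in counts}
    effects = {effect_id for _, effect_id in counts}
    return counts, drugs, effects

# Placeholder identifiers for illustration only.
predictions = [
    ("DRUG:1", "EFFECT:7", True),
    ("DRUG:1", "EFFECT:7", True),
    ("DRUG:1", "EFFECT:9", False),
    ("DRUG:2", "EFFECT:7", True),
]
counts, drugs, effects = aggregate(predictions)
print(sum(counts.values()), len(counts), len(drugs), len(effects))
```

Normalizing before counting means that different surface forms of the same drug or effect contribute to the same pair.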

Adverse effect extraction from SIDER 

Side Effect Resource (SIDER) is a database of adverse drug effects that links 888 drugs 
to 1450 adverse effects. It has been constructed manually from the summary of product 
leaflets of each drug. Drugs and their adverse effects were extracted from SIDER version 
1.01 that contains drug leaflets published before 2009. 

MHRA drug label changes 

In 2009, the MHRA proposed safety label updates for 26 drugs. These were not all of
the safety label updates that the MHRA identified in 2009, but those to which the MHRA
decided to give particular visibility through its web site. The new adverse drug effects
were manually extracted and serve as a reference standard for validating the potential
adverse drug events automatically extracted from Medline-2009 with the trained JSRE
model.

Validation of large scale relation extraction 

From the MHRA label change dataset, three drugs were arbitrarily chosen for deeper
investigation: Rituximab, Efalizumab, and Natalizumab, three anti-neoplastic
and immunomodulatory monoclonal antibodies. For these three drugs, potential
adverse drug events were selected from the Medline-2009 predictions and from SIDER.
Potential adverse drug events extracted from Medline-2009 that are not reported in SIDER
were manually checked against the MHRA label changes.
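
The comparison logic amounts to two set operations, sketched below in Python. The check against the label changes was done manually in this work, so the exact-match intersection is only a simplification; the function name is hypothetical and `EFFECT_A` is a placeholder rather than a real leaflet entry. The rituximab pair is taken from Table 4.

```python
def candidate_signals(predicted, leaflet, label_changes):
    """Split predictions into novel pairs and pairs confirmed by label changes.

    All arguments are sets of (drug, effect) tuples that are assumed to be
    normalized to the same vocabulary.
    """
    novel = predicted - leaflet          # extracted from text, absent from leaflets
    confirmed = novel & label_changes    # later added by the regulator
    return novel, confirmed

pml = ("rituximab", "progressive multifocal leukoencephalopathy")
predicted = {pml, ("rituximab", "EFFECT_A")}
leaflet = {("rituximab", "EFFECT_A")}
label_changes = {pml}

novel, confirmed = candidate_signals(predicted, leaflet, label_changes)
print(novel, confirmed)
```

Pairs that are novel but not confirmed are not necessarily wrong; they would still need expert review.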

Manual investigation of the machine-predicted potential adverse events showed that the
system was able to capture valid potential adverse events from free text that had not
yet been reported in product leaflets (Table 4). These adverse effects were later added to
the drug labels by the UK regulatory authorities. This provides a good example of
how the developed framework can help to capture potential adverse drug events from
the literature and thereby support pharmacovigilance.

Table 4 Potential adverse drug events extracted from MEDLINE not reported in drug
leaflets until 2009 and later introduced in package leaflets

Drug           Adverse effect
Rituximab      Progressive multifocal leukoencephalopathy
Efalizumab     Progressive multifocal leukoencephalopathy
Natalizumab    Hypersensitivity

Conclusions 

This work reports on the adaptation of the machine learning-based JSRE system for the
identification and extraction of potential adverse events of drugs in scientific case
reports. A methodology for enriching a sparsely annotated corpus and its subsequent use
to build classification models has been discussed. Evaluation of the system's performance
showed promising results. A use-case study on large-scale relation extraction from the
literature showed the system's ability to capture valid, under-reported, and novel
potential adverse events not yet present in product leaflets.

The performance of the system can be improved in several ways. In the current
experiments, only the default features accepted by JSRE were used. Optimizing the feature
representation to include additional features, for instance from syntactic parse trees of
the sentences, may further improve the results. Additional strategies such as
post-processing to classify relations that lack contextual description can help to recover
more relations. Furthermore, extending the approach to handle inter-sentence relations
needs to be considered in order to further increase coverage.

The reported experimental results reflect the current state of our research on the
identification of potential adverse drug events from text. Several follow-up strategies are
being pursued. The authors plan to benchmark the performance of several named entity
taggers against the ADE corpus for the identification of drug and condition mentions in
text. The current experiments were performed on the ADE corpus, since it was the only
one available when this work was done; however, while this report was being written, a
new corpus was published, namely the EU-ADR corpus [22]. It will be interesting to see
whether the performance of JSRE on the EU-ADR corpus differs from that on the ADE
corpus.

Similarly, public and commercial relation extraction systems will be benchmarked [23],
and the practical impact of the information extracted from text on predicting drug label
changes will be studied in detail.

The use of ontologies for driving information extraction has been reported [24,25]. We
plan to explore various available tools (e.g. ODIE, OBCIE, semantixs) using the
OAE ontology and to compare the performance of ontology-driven or ontology-based
methods for information extraction against the method presented here.

The current work has demonstrated promising results. It has the potential to reduce
manual reading time, improve the quality of the signal detection process, and thereby
contribute to safer use of drugs, to the benefit of patients and society. We speculate
that this work could also pave the way for pharmacovigilance applications on social
media and multimedia sources.